Add optimizer to convert min_by/max_by to row number function #25190
feilong-liu merged 1 commit into prestodb:master
Maybe I am missing something but in
Correct, here is the definition of
And
import static com.facebook.presto.sql.relational.Expressions.comparisonExpression;
import static com.google.common.collect.ImmutableMap.toImmutableMap;

public class MinMaxByToWindowFunction
Can you add a small comment explaining the plan changes?
Sure, will add in a separate PR.
private int eagerPlanValidationThreadPoolSize = 20;
private boolean innerJoinPushdownEnabled;
private boolean inEqualityJoinPushdownEnabled;
private boolean rewriteMinMaxByToTopNEnabled;
I guess row number adds sorting, so it might not always be efficient, but if your performance numbers show otherwise, can we make it on by default?
I want to be conservative for now. Will consider setting it to true after getting more stats for this optimizer.
Description
This optimization converts queries like
to
Here feature1 and feature2 are maps. The rewrite avoids the expensive aggregation over feature1 and feature2. This pattern is commonly used to get the latest features in machine learning workloads.
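To illustrate the equivalence this rewrite relies on, here is a minimal Java sketch. It is not the optimizer's code. It assumes a hypothetical row shape (id, ts, feature1) and compares two ways of getting the latest feature1 per id: a max_by-style aggregation, and a partition, sort, and keep-row-number-1 filter. The class name, field names, and method names are all invented for illustration. Ties on ts and NULL handling are ignored.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: models the two query shapes in memory, not Presto plan nodes.
class MinMaxBySketch {
    // Hypothetical row: an id to group by, a ts to order by, and a map-valued feature.
    static final class Row {
        final String id;
        final long ts;
        final Map<String, Double> feature1;

        Row(String id, long ts, Map<String, Double> feature1) {
            this.id = id;
            this.ts = ts;
            this.feature1 = feature1;
        }
    }

    // Aggregation form: per id, keep the feature1 of the row with the largest ts,
    // in the spirit of max_by(feature1, ts) grouped by id.
    static Map<String, Map<String, Double>> maxByAggregation(List<Row> rows) {
        Map<String, Row> best = new HashMap<>();
        for (Row row : rows) {
            Row current = best.get(row.id);
            if (current == null || row.ts > current.ts) {
                best.put(row.id, row);
            }
        }
        Map<String, Map<String, Double>> result = new TreeMap<>();
        for (Map.Entry<String, Row> entry : best.entrySet()) {
            result.put(entry.getKey(), entry.getValue().feature1);
        }
        return result;
    }

    // Window form: partition by id, order each partition by ts descending,
    // and keep only the row whose row number is 1.
    static Map<String, Map<String, Double>> rowNumberFilter(List<Row> rows) {
        Map<String, List<Row>> partitions = new HashMap<>();
        for (Row row : rows) {
            partitions.computeIfAbsent(row.id, key -> new ArrayList<>()).add(row);
        }
        Map<String, Map<String, Double>> result = new TreeMap<>();
        for (Map.Entry<String, List<Row>> entry : partitions.entrySet()) {
            List<Row> sorted = new ArrayList<>(entry.getValue());
            sorted.sort(Comparator.comparingLong((Row r) -> r.ts).reversed());
            result.put(entry.getKey(), sorted.get(0).feature1); // row number 1
        }
        return result;
    }
}
```

Both methods return the same map for inputs with distinct ts values per id. The sketch also shows the sort in the window form, which is the cost raised in the review comment above about row number adding sorting.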
Motivation and Context
Query optimization to reduce cost.
Impact
Query optimization to reduce cost.
Test Plan
Unit tests